Modeling of Log Kow of a Series of PAHs using Computational Chemistry
Fatiha Mebarki1, Souhaila Meneceur2, Abderrhmane Bouafia2*
1Department of Material Sciences, Faculty of Science and Technology, Amine Elokkal Elhadj
Moussa Eg Akhamouk University – Tamanrasset 11000, Algeria.
2Process Engineering Department, Faculty of Technology, Hamma Lakhdar University-El oued 3900, Algeria.
*Corresponding Author E-mail: abdelrahmanebouafia@gmail.com
ABSTRACT:
Chemometric methods are central to QSAR modeling, where mathematical models are built for large sets of molecules described by many physicochemical and structural parameters. Quantitative structure-toxicity relationship (QSTR) models are mainly based on multiple regression analysis. In this work, the least squares method was used to derive a QSTR model for Log Kow (octanol-water partition coefficient) of 16 hydrocarbon compounds, using HyperChem 6.3 to compute the descriptors and MINITAB 16 for data modeling. A three-descriptor model was obtained [two electronic descriptors (QSER descriptors), HOMO (highest occupied molecular orbital) and LUMO (lowest unoccupied molecular orbital), and one QSAR descriptor (hydration energy)], with correlation coefficient r = 0.868, S = 0.635, R2 = 75.4%, R2adj = 73.7% and Durbin-Watson statistic = 1.85277, and was examined graphically with a goodness-of-fit diagram and a line plot. After removing the aberrant compounds (toxic compounds), the new model shows a high correlation coefficient r = 0.9581, S = 0.4316, determination coefficient R2 = 91.8%, adjusted R2adj = 89.3% and Durbin-Watson statistic D = 2.373. The selected model with three explanatory variables is robust and fits well. Two influential compounds that are important for the model were detected, and no aberrant compounds remain in the studied sample.
Modern methods include statistical and graphical analysis (in QSAR analysis) by mathematical algorithms [6-10].
In this study, the relationship between molecular structure and the toxicity coefficient of 16 polycyclic aromatic hydrocarbon compounds was evaluated by multiple regression, in order to identify the most toxic compounds when abnormal (aberrant) points appear in the line plot and to see how they are distributed in the goodness-of-fit graph [10-15]. Finally, these points are removed to determine the effect of these compounds on improving the model [15-22].
METHODOLOGY:
The Data Set:
The n-octanol-water partition coefficients (Log Kow) of 16 hydrocarbon compounds [22-27] were taken from the work of Frédéric Jouannin [28-29]. The compounds and their experimental Log Kow values are listed in Table 1.
Table 1. Octanol-water partition coefficient (Log Kow) for the selected polycyclic aromatic hydrocarbons (PAHs).
| S. No. | Compound | Log Kow |
|---|---|---|
| 1 | Naphthalene | 3.37 |
| 2 | Acenaphthylene | 4.07 |
| 3 | Acenaphthene | 4.33 |
| 4 | Fluorene | 4.18 |
| 5 | Phenanthrene | 4.46 |
| 6 | Anthracene | 4.45 |
| 7 | Fluoranthene | 5.33 |
| 8 | Pyrene | 5.32 |
| 9 | Benzo(a)anthracene | 5.61 |
| 10 | Chrysene | 5.61 |
| 11 | Benzo(b)fluoranthene | 6.57 |
| 12 | Benzo(k)fluoranthene | 6.84 |
| 13 | Benzo(a)pyrene | 6.04 |
| 14 | Dibenzo(a,h)anthracene | 5.97 |
| 15 | Indeno(1,2,3-cd)pyrene | 7.66 |
| 16 | Benzo(g,h,i)perylene | 7.23 |
Selection of descriptors:
The structures of the series of cyclic hydrocarbons were drawn and optimized to a stable geometry with the semi-empirical PM3 method (geometry optimization) in the HyperChem software [33]. The electronic properties (HOMO, LUMO, Etot, Enucle, Ebinding, Estab) and the QSAR properties (Log P, refractivity, volume, polarization, M, ...) were then computed [29-35].
The relationship of each descriptor with Log Kow was then analysed using MINITAB 16 software, and the best model was chosen as the one with the highest determination coefficient R2 and the smallest standard deviation (S).
Development of the model:
This study relies on statistical software to fit a multivariable linear equation of the general form:
F(x) = a + b1x1 + b2x2 + b3x3 (1)
a: regression constant
b1, b2 and b3: regression coefficients.
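A minimal sketch of how the coefficients of Equation (1) can be estimated by least squares is given below. It is not the MINITAB procedure used in this work: the function name `fit_mlr` and the small data set are hypothetical placeholders (synthetic values generated from known coefficients, not the PAH descriptors), and the normal equations are solved by plain Gauss-Jordan elimination so that no external library is assumed.

```python
def fit_mlr(rows, y):
    """Ordinary least squares for y = a + b1*x1 + ... + bk*xk.

    rows: list of descriptor tuples (x1, ..., xk), one per compound.
    y:    list of observed responses.
    Returns [a, b1, ..., bk] by solving the normal equations (X'X) beta = X'y.
    """
    X = [[1.0, *r] for r in rows]  # prepend the intercept column
    m = len(X[0])
    # Augmented normal-equation matrix [X'X | X'y]
    A = [
        [sum(xi[i] * xi[j] for xi in X) for j in range(m)]
        + [sum(xi[i] * yi for xi, yi in zip(X, y))]
        for i in range(m)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(m):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [u - f * v for u, v in zip(A[r], A[col])]
    return [A[i][m] / A[i][i] for i in range(m)]


# Synthetic placeholder data (NOT the descriptors of this study):
# responses generated exactly from a = 1, b1 = 2, b2 = -3, b3 = 0.5.
rows = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0), (1, 0, 1), (2, 1, 3)]
y = [1 + 2 * x1 - 3 * x2 + 0.5 * x3 for x1, x2, x3 in rows]

a, b1, b2, b3 = fit_mlr(rows, y)
print(a, b1, b2, b3)
```

With real data, each tuple in `rows` would hold the three descriptor values of one compound and `y` the experimental Log Kow values.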
The statistical analysis of the model is based on the following quantities: R2, R2adj, r, standard error S, D, VIF and P [34].
r is the correlation coefficient:
(2)
The significance of the correlation coefficient is tested using:
(3)
t is the Student t value.
R2 is the determination coefficient:
(4)
(5)
R2adj is the adjusted R2:
(6)
VIF is the variance inflation factor:
(7)
The p-value is the probability of obtaining a test statistic that is at least as extreme as the actual calculated value, if the null hypothesis is true.
DW is the Durbin-Watson statistic.
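For reference, the usual textbook forms of these statistics are collected below. They are written under the assumption that the definitions used in this work are the standard ones. The symbols are notation chosen for this summary: x_i and y_i are paired observations, \hat{y}_i the calculated values, e_i the residuals, n the number of compounds, k the number of descriptors, and R_j^2 the determination coefficient obtained when descriptor j is regressed on the other descriptors.

```latex
\begin{align*}
r &= \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}
          {\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2\,\sum_{i=1}^{n}(y_i-\bar{y})^2}} \\
t &= \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} \\
R^2 &= 1-\frac{\sum_{i=1}^{n}(y_i-\hat{y}_i)^2}{\sum_{i=1}^{n}(y_i-\bar{y})^2} \\
R^2_{adj} &= 1-\left(1-R^2\right)\frac{n-1}{n-k-1} \\
VIF_j &= \frac{1}{1-R_j^2} \\
DW &= \frac{\sum_{i=2}^{n}(e_i-e_{i-1})^2}{\sum_{i=1}^{n}e_i^2}
\end{align*}
```

The t statistic is compared with the tabulated Student value for n - 2 degrees of freedom.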
RESULT AND DISCUSSION:
The best model:
The best model uses three descriptors: HOMO (highest occupied molecular orbital), EH (hydration energy) and LUMO (lowest unoccupied molecular orbital), with n = 16 compounds.
Statistical and graphical analysis:
All indicators of the theoretical sample in Table 2 are within the statistical conditions for accepting the model: standard deviation S = 0.635 < 1, R2 = 80.2%, probability p = 0.000, variance inflation factor VIF = 1.400 and Durbin-Watson statistic D = 1.90077. The model is free from any statistical problem.
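The two diagnostics quoted above can be sketched in a few lines of Python, assuming the standard definitions of the Durbin-Watson statistic and of the variance inflation factor. The function names and the short residual list are hypothetical illustrations; the residuals of the PAH model are not reproduced here.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squares."""
    num = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    den = sum(e * e for e in residuals)
    return num / den


def vif_from_r2(r2_j):
    """VIF of a descriptor whose regression on the other descriptors gives r2_j."""
    return 1.0 / (1.0 - r2_j)


# Hypothetical residuals that alternate in sign (strong negative autocorrelation)
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))

# Under this definition, the reported VIF = 1.400 corresponds to this R_j^2
r2_j = 1.0 - 1.0 / 1.400
print(round(r2_j, 3), vif_from_r2(r2_j))
```

Values of the Durbin-Watson statistic close to 2 indicate residuals without first-order autocorrelation, and a VIF close to 1 indicates weak collinearity between descriptors.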
Table 2. Diagnostic statistics of the sample.
| S | R2 | R2(adj) | R2(pred) | P | VIF | DW |
|---|---|---|---|---|---|---|
| 0.616007 | 80.2% | 75.3% | 75.3% | 0.00 | 1.400 | 1.90077 |
The model based on three descriptors, obtained with the Minitab 16 software [34], is:
(9)
The Student t value for the model is t = 4.608 (the model is accepted).
Relationship between Y and X:
The correlation matrix in Table 3 and Figure 2 shows a positive correlation (r = 0.8955) between HOMO and Log Kow, and a negative correlation between the two other descriptors (EH, LUMO) and Log Kow.
Using the Student test for the correlation coefficient of the model, t(14; 0.05) = 2.1448 < t = 6.54. The correlation coefficient is statistically significant and is not the result of chance; it reflects the real strength of the relationship at the 95% level.
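The Student value quoted above can be checked with a short calculation, assuming the usual test statistic t = r·sqrt(n - 2)/sqrt(1 - r²) with n - 2 degrees of freedom. The function name is a hypothetical helper; r = 0.868 is the HOMO correlation from Table 3, n = 16 is the number of compounds, and 2.1448 is the tabulated value quoted in the text.

```python
import math


def t_for_correlation(r, n):
    """Student t statistic for testing that a Pearson correlation is zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


t_calc = t_for_correlation(0.868, 16)  # r from Table 3, n = 16 compounds
t_table = 2.1448                       # t(14; 0.05) as quoted in the text

print(round(t_calc, 2))
print("significant" if t_calc > t_table else "not significant")
```

The computed value agrees with the reported t = 6.54.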
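Table 3. Relationship matrix (correlation coefficient r with its p-value below).
| | Log Kow | HOMO | EH |
|---|---|---|---|
| HOMO (r) | 0.868 | | |
| HOMO (p) | 0.000 | | |
| EH (r) | -0.465 | -0.361 | |
| EH (p) | 0.07 | 0.170 | |
| LUMO (r) | -0.348 | -0.471 | 0.482 |
| LUMO (p) | 0.186 | 0.062 | 0.059 |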
Figure 1. Histogram of the correlation coefficients.
The goodness-of-fit:
As shown in Figure 3, the points are spread out but lie within the continuous 95% confidence bands around the regression line; the distribution of the points can be modeled by the equation [39-49].
As shown in Figure 4, the fitted line y = f(x) is linear, and the experimental and calculated y values are very close. This model fits the calculated y data well (R² = 100%), with 100% agreement between the experimental and calculated y values.
The relationship between Log Kow and HOMO is statistically significant (p < 0.05).
Figure 2. diagram of the goodness-of-fit.
Compounds 2 (Acenaphthylene), 3 (Acenaphthene) and 12 (Benzo(k)fluoranthene) have a major influence (above the limit value of 0.52) and are important for the model.
The data associated with special causes were then removed and the analysis was redone.
Figure 3. Line plot of the MLR model.
After redoing the analysis twice, the aberrant compounds 7 (Fluoranthene) and 14 (Dibenzo(a,h)anthracene) were removed. These compounds are regarded as highly toxic compounds because they are not well fitted by the equation.
The new model has very good statistical values: standard deviation S = 0.431636, R2 = 91.8%, R2adj = 89.3%, R2pred = 79.75%, probability p = 0.000, variance inflation factor VIF = 1.310 and Durbin-Watson statistic D = 2.37329, which shows its good adjustment power (Table 4).
Table 4. Diagnostic statistics of the sample (after removing the aberrant compounds).
| S | R-sq | R-sq(adj) | R-sq(pred) | P | VIF | DW |
|---|---|---|---|---|---|---|
| 0.431636 | 91.8% | 89.3% | 79.75% | 0.000 | 1.31 | 2.37329 |
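The adjusted determination coefficients reported in Tables 2 and 4 can be reproduced from the R2 values, assuming the standard formula R2adj = 1 - (1 - R2)(n - 1)/(n - k - 1) with k = 3 descriptors. The function name is a hypothetical helper, and n = 14 for the reduced model is an assumption based on the removal of two compounds from the original 16.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for a model with k descriptors fitted on n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Full data set: R2 = 80.2 %, n = 16 compounds, k = 3 descriptors (Table 2)
full = adjusted_r2(0.802, 16, 3)

# After removing compounds 7 and 14: R2 = 91.8 %, n = 14 (assumed), k = 3 (Table 4)
reduced = adjusted_r2(0.918, 14, 3)

print(f"{100 * full:.2f} %  {100 * reduced:.2f} %")
```

Both results agree, to within the rounding of the published R2 values, with the reported 75.3% and 89.3%.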
The model after removing the aberrant compounds is:
(10)
The near-perfect positive correlation (r = 0.9581) in Table 5 and Figure 4 indicates that when HOMO increases, Log Kow also tends to increase, while the two other descriptors are negatively correlated with Log Kow (r = -0.455 for EH, with the LUMO value given in Table 5).
Table 5. Correlation matrix (after removing the aberrant compounds), with the correlation coefficient r and its p-value below.
| | Log Kow | HOMO | EH |
|---|---|---|---|
| HOMO (r) | 0.947 | | |
| HOMO (p) | 0.000 | | |
| EH (r) | -0.455 | -0.356 | |
| EH (p) | 0.102 | 0.212 | |
| LUMO (r) | -0.342 | -0.38 | 0.472 |
| LUMO (p) | 0.231 | 0.181 | 0.089 |
Figure 4. Histogram of the correlation coefficients (after removing the aberrant compounds).
The distribution of the residuals in Figure 5 has a coefficient of skewness of -0.1451, defined by (3*(x̄-Med)/s) (with x̄ = 0 and Median = 0.0225), and a coefficient of kurtosis of -0.711. The Jarque-Bera normality test of the residuals, which combines the coefficient of skewness and the coefficient of kurtosis and has two degrees of freedom (JB = 0.343 < 5.99), indicates that the residuals follow a normal distribution.
The mean of the residuals is 0 (95% confidence interval of -0.2185 to 0.2185).
The standard deviation is 0.3785 (95% confidence interval of 0.274 to 0.690).
Using a significance level of 0.05, the Anderson-Darling normality test (A-squared = 0.20) indicates that the residuals follow a normal distribution.
The boxplot shows: 1st quartile -0.3264, median 0.0225 (95% confidence interval limits of 0.305 and 0.193), 3rd quartile 0.2359, maximum 0.616, and no outlier (aberrant) compounds are present.
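The Jarque-Bera value can be checked from the reported skewness and kurtosis, assuming the standard form JB = (n/6)(S² + K²/4), where K is the excess kurtosis. The function name is a hypothetical helper, n = 14 is an assumption (16 compounds minus the two removed), and 5.99 is the chi-square limit for two degrees of freedom quoted in the text.

```python
def jarque_bera(n, skewness, excess_kurtosis):
    """Jarque-Bera statistic from sample skewness and excess kurtosis."""
    return n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)


jb = jarque_bera(14, -0.1451, -0.711)  # n = 14 assumed; S and K from the text
chi2_limit = 5.99                      # chi-square(2; 0.05) as quoted in the text

print(round(jb, 3))
print("normality not rejected" if jb < chi2_limit else "normality rejected")
```

The result agrees with the reported JB = 0.343 to within rounding and lies far below the limit of 5.99.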
Figure 5. Summary of RESI (After removing the aberrant compounds).
Figure 6 shows a normal distribution.
Figure 6. diagram of the goodness-of-fit. (After removing the aberrant compounds)
There are no abnormal (aberrant) compounds in Figure 7. Influential compounds can have a strong effect on the results (score): 2 (Fluorene) and 3 (Acenaphthene) have a major influence and are important for the model.
Figure 7. Line plot of the MLR model (After removing the aberrant compounds)
Interpretation of the model:
The hydrophobicity is expressed by the octanol-water partition coefficient (Log Kow), which estimates the solubility in both aqueous and organic phases (in general, n-octanol-water is used) [27].
Three descriptors were able to model the octanol-water partition coefficient. The coefficient of the electronic descriptor HOMO (4.41) in Equation (9), together with its correlation coefficient (r = 0.868) in Table 3, shows the regular positive impact of this descriptor on the value of Log Kow.
The coefficients of the two other descriptors in Equation (9), the QSAR descriptor EH (-0.509) and the electronic (QSER) descriptor LUMO (0.337), together with their correlation coefficients (r = -0.465 and r = -0.348) in Table 3, indicate a negative correlation of these descriptors with the value of Log Kow.
After removing the aberrant points, the coefficient of the electronic descriptor HOMO (5.22) in Equation (10), together with its correlation coefficient (r = 0.947) in Table 5, shows the regular positive impact of this descriptor on the value of Log Kow.
The coefficients of the two other descriptors in Equation (10), the QSAR descriptor EH (-0.362) and the electronic (QSER) descriptor LUMO (0.170), together with their correlation coefficients (r = -0.455 and r = -0.345) in Table 5, indicate a negative correlation of these descriptors with the value of Log Kow.
CONCLUSION:
The octanol-water partition coefficients (Log Kow) of 16 polycyclic aromatic hydrocarbons (PAHs) were modeled. Three descriptors, two electronic (QSER) descriptors, HOMO (highest occupied molecular orbital) and LUMO (lowest unoccupied molecular orbital), and one QSAR descriptor, EH (hydration energy), were selected to model the n-octanol-water partition coefficient (Log Kow) of the set of 16 PAHs with a linear regression model fitted by the method of least squares.
The multilinear model with three variables is robust and gives a good quality of fit after removing the aberrant points 7 (Fluoranthene) and 14 (Dibenzo(a,h)anthracene), with good statistics: r = 0.9581, S = 0.4316, R2 = 91.8%, R2adj = 89.3% and Durbin-Watson statistic = 2.373.
Compounds 2 (Fluorene) and 3 (Acenaphthene) are influential and important for the model, and no aberrant point remains.
CONFLICTS OF INTEREST:
The authors declare that there is no conflict of interest.
REFERENCES:
1. Polynuclear aromatic hydrocarbons (PAH). In: Air quality guidelines for Europe. Copenhagen, World Health Organization Regional Office for Europe, 1987, pp. 105–117.
2. Marija Baranac-Stojanović. 2021. Revival of Hückel Aromatic (Poly)benzenoid Subunits in Triplet State Polycyclic Aromatic Hydrocarbons by Silicon Substitution. Asian Journal of Research in Chemistry https://doi.org/10.1002/asia.202101261.
3. Agency for toxic substances and disease registry. Toxicological profile for polycyclic aromatic hydrocarbons (PAHs): update. Atlanta, GA, US Department of Health and Human Services, Public Health Services.1994.
4. Kohei Fuchibe.2020. Fluorinated Phenanthrenes as Aryne Precursors: PAH Synthesis Based on Domino Ring Assembly Using 1,1-Difluoroallenes. Asian Journal of Research in Chemistry https://doi.org/10.1002/asia.202000069
5. Masato Honda and Nobuo Suzuki. 2020. Toxicities of Polycyclic Aromatic Hydrocarbons for Aquatic Animals.Int. J. Environ. Res. Public Health,17, 1363; doi:10.3390/ijerph17041363.
6. Hilal Colak.2020. Investigation of Polycyclic Aromatic Hydrocarbons in Foods Asian Journal of Research in Chemistry .22(8):5797-5807.
7. Hylland, K.2006. Polycyclic aromatic hydrocarbon (PAH) ecotoxicology in marine ecosystems. J. Toxicol. Environ., 69, 109–123.
8. Kasama Janvijitsakul*, Vladimir I. Kuprianov 2007. Polycyclic Aromatic Hydrocarbons in Coarse Fly Ash Particles Emitted from Fluidized-bed Combustion of Thai Rice Husk. Asian Journal on Energy and Environment, 08(04), 654-662. ISSN 1513-4121.
9. Collins, J.F.; Brown, J.P.; Alexeeff, G.V.; Salmon, A.G. 1998.Potency equivalency factors for some polycyclic aromatic hydrocarbons and polycyclic aromatic hydrocarbon derivatives. Regul. Toxicol. Pharmacol. 28, 45–54.
10. Monago, C. C., Nwiko, E. B., Chuku, L. C.2010. Assessment of Polycyclic Aromatic Hydrocarbon Levels in Blood of Refinery Workers in Nigeria. Asian Journal of Research in Chemistry Volume - 3(3).
11. Devi, N.L.; Yadav, I.C.; Shihua, Q.; Dan, Y.; Zhang, G.; Raha, P.2016. Environmental carcinogenic polycyclic aromatic hydrocarbons in soil from Himalayas, India: Implications for spatial distribution, sources apportionment and risk assessment. Chemosphere. 144, 493–502
12. Kuo, C.-Y.; Cheng, Y.-W.; Chen, Y.-W.; Lee, H. 1998.Correlation between the amounts of polycyclic aromatic hydrocarbons and mutagenicity of airborne particulate samples from Taichung City, Taiwan. Environ. Res. 78, 43–49.
13. Rengarajan, T.; Rajendran, P.; Nandakumar, N.; Lokeshkumar, B.; Rajendran, P.; Nishigaki, I. 2015.Exposure to polycyclic aromatic hydrocarbons with special focus on cancer. Asian Pac. J. Trop. Biomed.vol 5, 182–189.
14. Bekki, K.; Toriba, A.; Tang, N.; Kameda, T.; Hayakawa, K.2013. Biological effects of polycyclic aromatic hydrocarbon derivatives. J. UOEH, 35, 17–24.
15. Ikenaka, Y.; Oguri, M.; Saengtienchai, A.; Nakayama, S.M.; Ijiri, S.; Ishizuka, M. 2013.Characterization of phase-II conjugation reaction of polycyclic aromatic hydrocarbons in fish species: Unique pyrene metabolism and species specificity observed in fish species. Environ. Toxicol. Pharmacol. 36, 567–578.
16. Jacob, J. 2008.The significance of polycyclic aromatic hydrocarbons as environmental carcinogens. 35 years research on PAH—A retrospective. Polycycl. Aromat. Compd. vol 28, 242–272.
17. Jørgensen, A.; Giessing, A.M.; Rasmussen, L.J.; Andersen, O.2008. Biotransformation of polycyclic aromatic hydrocarbons in marine polychaetes. Mar. Environ. Res. vol 65, 171–186.
18. Worth, AP, Bassan, A, De Bruijn, J, Gallegos Saliner, A, Netzeva, T, Patlewicz, G, Tsakovska, L and Eisenreich, S. 2007. The role of the European Chemicals Bureau in promoting the regulatory use of QSAR methods, SAR and QSAR in Environmental Research, Vol 18 Nos 1-2, pp. 111-125.
19. Hansch, C, Maloney, PP, Fujita, T, and Muir, RM. 1962. Correlation of biological activity of phenoxyacetic acids with Hammett substituent constants and partition coefficients, Nature, Vol 194, pp. 178-180.
20. Bordhar, M, Ghasemi, J, Fall, A, Y, and Fazaeli, R. 2013. Chemometric modeling to predict aquatic toxicity of benzene derivatives using stepwise multi linear and partial least square. Asian Journal of Chemistry, Vol 25 No. 1, pp. 331-342.
21. D. Mackay, W.Y. Shiu, K.C. Ma,.1998. Illustrated Handbook of Physical–Chemical Properties and Environmental Fate for Organic Chemicals, vol. 3, Lewis, London.
22. H.B. Krop, J.M. van Velzen, J.J.R. Parsons, H.A.J. Goves, 1997. Chemosphere 34, 107.
23. H. Kubinyi, 1993. QSAR: Hansch Analysis and Related Approaches, VCH, Weinheim.
24. S.G. Huling, J.W. Weaver, Dense nonaqueous phase liquids. Ground water issue. EPA/540/4-91-002, US EPA. R.S. Kerr Environmental Research Laboratory, Ada, OK, p. 21.
25. C.J. Newell, S.D. Acree, R.R. Ross, S.G. Huling, Light nonaqueous phase liquids. Ground water issue. EPA/540/S-95/ 500, US EPA. R.S. Kerr Environmental Research Laboratory, Ada, OK, p. 28.
26. Robert A. Yaffee. 2002. Robust Regression Analysis: Some Popular Statistical Package Options. ResearchGate. https://www.researchgate.net/publication/266436942
27. Eriksson, L., Jaworska, J., Worth, A., Cronin, M., Mc Dowell, R.M., Gramatica, P. (2003). Methods for reliability, uncertainty assessment, and applicability evaluations of regression based and classification QSPRs. Environmental Health Perspective Journal, 111(10): 1361-1375. https://doi.org/10.1289/ehp.5758.
28. Tropsha, A., Gramatica, P., Grombar, V.K. (2003). The importance of being Earnest: Validation is the absolute essential for successful application and interpretation of QSPR models. QSAR &Combinatorial Science, 22(1): 69-76. https://doi.org/10.1002/qsar.200390007.
29. Frédéric Jouannin. 2004. Etude de la mobilité des hydrocarbures aromatiques polycycliques (HAP) contenus dans un sol industriel pollué. Thèse de doctorat. L’Institut National des Sciences Appliquées de Lyon, Paris.
30. Draper, N.R., Smith, H. (1998). Applied Regression Analysis. Third Edition. Wiley Series in Probability and Statistics. New York.
31. Ribeiro FAL, Ferreira MMC.2003. QSPR models of boiling point, octanol-water partition coefficient and retention time index of polycyclic aromatic hydrocarbons. Journal of Molecular Structure: THEOCHEM; 663(5): 109-126.
32. Fabiana A. and Marcia M. 2003. QSPR models of boiling point, octanol-water partition coefficient and retention time index of polycyclic aromatic hydrocarbons.
33. HyperchemTM Release 6.03 for Windows, Molecular Modelling System, 2000.
34. MINITAB, Release 13.31, Statistical Software, 2000.
35. Ramsay, L.F., Schafer, W.D. (1997). The statistical sleuth, Belmont: Wadsworth Publishing Company.
36. A. Boudehane, S. Atia, L. Kribaa, A. Lounas, C. Balducci, A. Cecinato 2020. Detecting some VOCs (PAH) in hospital and university in Ouargla city Algeria Asian Journal of Research in Chemistry Volume 13(1),. DOI: 10.5958/0974-4150.2020.00004.8
37. Shuchi Gupta, Seema Acharya.2018. Spectrofluorimetric Analysis of Interaction of Benzo (a) Pyrene and Surfactant Micelles. Asian Journal of Research in Chemistry. Volume – 11(3). DOI: 10.5958/0974-4150.2018.00113.X
38. Kadhiravansivasamy, S. Sivajiganesan, T. Periyathambi, V. Nandhakumar, M. Pugazhenthi. 2017.Synthesis and Characterization of Schiff Base CoII, NiII and CuII Complexes derived from 2-Hydroxy-1-naphthaldehyde and Ethylenediamine. Asian Journal of Research in Chemistry Volume– 10(2),. 10.5958/0974-4150.2017.00016.5.
39. Anthony J. Burke, Carla S. S. Teixeira, Sérgio F. Sousa 2022..Transformation of a Chiral Glycolic Acid to an Isoaurone: Stereochemical Assignment of a Benzilic Acid Rearrangment Product. Asian Journal of Organic Chemistry. DOI: 10.1002/ajoc.202100692.
40. Buralla KK, Parthasarathy V. Central composite design based development and validation of an rp-hplc method for paclitaxel in bulk and pharmaceutical dosage form. Research Journal of Pharmacy and Technology. 2020;13(10):4895-902.
41. Fatmarahmi DC, Susidarti RA, Swasono RT, Rohman A. Identification and quantification of metamizole in traditional herbal medicines using spectroscopy FTIR-ATR combined with chemometrics. Research Journal of Pharmacy and Technology. 2021;14(8):4413-9.
42. Gandhi SV, Sonawane PS. Chemometric–Assisted UV Spectrophotometric Method for Determination of CefiximeTrihydrate and Cloxacillin Sodium in Pharmaceutical Dosage Form. Asian Journal of Research in Chemistry. 2018;11(4):705-9.
43. Jaiswal S, Chavhan SA, Shinde SA, Wawge NK. New Tools for Herbal Drug Standardization. Asian J Res Pharm Sci. 2018;8(3):161-9.
44. Mathew C, Varma S. Green Analytical Methods based on Chemometrics and UV spectroscopy for the simultaneous estimation of Empagliflozin and Linagliptin. Asian Journal of Pharmaceutical Analysis. 2022;12(1).
45. Nangare SA, Rohane SH. Review on Guassion, the General Purpose in Computational Chemistry for Medicinal Chemistry. Asian Journal of Research in Chemistry. 2021;14(1):89-91.
46. Patel K, Shah UA, Joshi HV, Patel JK, Patel CN. QbD Stressed Development and Validation of Stability-Indicating RP-HPLC Method for the Simultaneous Estimation of Linagliptin and Metformin HCl in Pharmaceutical Dosage Form. Research Journal of Pharmacy and Technology. 2022;15(5):1917-23.
47. Rao MVS, Ramam VA, Rao VM, Rao RS. Singular Value Decomposition is used as a Chemo metric Tool for Kinetic Investigations. Research Journal of Science and Technology. 2013;5(4):412-20.
48. Sankar ASK, Vetrichelvan T, Venkappaya D, Nagavalli D, Divya O. Simultaneous Estimation of Ramipril, Aspirin and Atorvastatin Calcium by Classical Least Squares Regression in Capsule Dosage Form. Research Journal of Pharmacy and Technology. 2011;4(3):398-401.
49. Shiyan S, Arifin A, Amriani A, Pratiwi G. Immunostimulatory activity of ethanol extract from Calotropisgigantea L. flower in rats against Salmonella typhimurium infection. Research Journal of Pharmacy and Technology. 2020;13(11):5244-50.
Received on 23.06.2022 Modified on 31.07.2022
Accepted on 20.09.2022 ©AJRC All right reserved
Asian J. Research Chem. 2022; 15(6):443-448.